ABSTRACT

It is common to estimate a distribution by means of a step function. Such estimates can be made continuous by connecting the points of the steps with straight line segments. In this paper the best estimator of this class, in the sense of minimum risk, is found for data which are uniformly distributed. This risk is then compared with those of the sample distribution function and the Pyke estimator.
I. INTRODUCTION

In reference [1], Prof. Read studies the properties of some estimators for a continuous distribution. In particular, he computes the risk function (using the assumption of the uniform distribution) of the estimator that connects the points $(X_i, i/(n+1))$, where $X_i$ is the $i$-th smallest observation, with straight line segments. This estimator is shown to be better than the sample distribution function in an asymptotic sense. It is called the Pyke estimator.

It is natural to ask the question, "How well can a distribution be estimated by a function that connects the points $(X_i, C_i)$ with straight line segments?" The purpose of this paper is to find the sequence $\{C_i\}$ that minimizes the integrated risk function (using the uniform distribution) and to compare that risk function with those of the sample distribution function and the Pyke estimator.





II. DETERMINATION OF THE OPTIMUM COEFFICIENTS


Let $C_0, C_1, C_2, \ldots, C_{n+1}$ be an increasing sequence with $C_0 = 0$ and $C_{n+1} = 1$. A continuous function estimator for $F(x)$ can be defined by

\[ H(x) = C_r + \frac{x - X_r}{X_{r+1} - X_r}\,(C_{r+1} - C_r) \quad \text{for } X_r \le x \le X_{r+1},\ r = 0, \ldots, n, \tag{2.1} \]

where $X_1 \le X_2 \le \cdots \le X_n$ are the order statistics of a random sample from an absolutely continuous distribution function $F(x)$. It is convenient to assume that the population sampled is bounded and contained in a finite interval $[a,b]$. Thus, we can define $X_0 = a$, $X_{n+1} = b$ and the risk function $R$ by

\[ R = E\left\{ \int_a^b \left[ F(x) - H(x) \right]^2 dx \right\}. \tag{2.2} \]
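The definition (2.1) can be sketched in a few lines of Python. This is a minimal illustration, not the computation used in the paper: the function names `make_estimator` and `pyke_coefficients` are hypothetical, and the Pyke choice $C_r = r/(n+1)$ is assumed here only to have a concrete coefficient sequence to plug in.

```python
from bisect import bisect_right


def pyke_coefficients(n):
    """Assumed example sequence C_r = r / (n + 1), r = 0, ..., n + 1."""
    return [r / (n + 1) for r in range(n + 2)]


def make_estimator(sample, coeffs, a=0.0, b=1.0):
    """Return H(x) of (2.1): straight segments through (X_r, C_r), X_0 = a, X_{n+1} = b."""
    xs = [a] + sorted(sample) + [b]
    if len(coeffs) != len(xs):
        raise ValueError("need n + 2 coefficients C_0, ..., C_{n+1}")

    def H(x):
        if x <= a:
            return coeffs[0]
        if x >= b:
            return coeffs[-1]
        # r is the index with xs[r] <= x < xs[r + 1], so the width below is positive.
        r = bisect_right(xs, x) - 1
        width = xs[r + 1] - xs[r]
        return coeffs[r] + (x - xs[r]) / width * (coeffs[r + 1] - coeffs[r])

    return H


if __name__ == "__main__":
    H = make_estimator([0.6, 0.2], pyke_coefficients(2))
    for x in (0.1, 0.2, 0.4, 0.8):
        print(x, H(x))
```

Any increasing sequence with $C_0 = 0$ and $C_{n+1} = 1$ can be passed in place of the assumed Pyke sequence.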


For convenience the interval $[a,b]$ can be taken to be $[0,1]$. The estimator $H(x)$ is defined piecewise according to which of the random intervals $(X_r, X_{r+1})$ contains $x$. Assuming the data come from a uniform distribution, the joint p.d.f. of any two consecutive order statistics, say $X_r$ and $X_{r+1}$, is easily expressed. Using $X_r = u$ and $X_{r+1} = v$, the joint p.d.f. is

\[ f_{X_r, X_{r+1}}(u,v) = \frac{n!}{(r-1)!\,(n-r-1)!}\, u^{r-1} (1-v)^{n-r-1} \quad \text{for } 0 < r < n \text{ and } 0 \le u \le v \le 1, \tag{2.3} \]

\[ f_{X_0, X_1}(0,v) = n(1-v)^{n-1} \quad \text{for } r = 0 \text{ and } u = 0, \tag{2.4} \]

\[ f_{X_n, X_{n+1}}(u,1) = n u^{n-1} \quad \text{for } r = n \text{ and } v = 1. \tag{2.5} \]
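Let us define

\[ \Delta_r = C_{r+1} - C_r, \tag{2.6} \]

so that

\[ C_r = \sum_{j=0}^{r-1} \Delta_j, \tag{2.7} \]

and for simplicity write $f_{X_r, X_{r+1}}(u,v) = f_r(u,v)$. Then, rewriting (2.2),

\[ R = \sum_{r=0}^{n} \iiint_{0 \le u \le x \le v \le 1} \left[ x - \sum_{j=0}^{r-1} \Delta_j - \frac{x-u}{v-u}\,\Delta_r \right]^2 f_r(u,v)\,du\,dv\,dx. \tag{2.8} \]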

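Now the problem is to find the $\Delta_r$ values that give the minimum risk. Using the classical optimization technique with the Lagrangian form, the problem is to

\[ \text{minimize } G = R - \lambda \Big( \sum_{j=0}^{n} \Delta_j - 1 \Big) \quad \text{subject to } \sum_{j=0}^{n} \Delta_j = 1, \]

using $\lambda$ for the Lagrange multiplier. Written out,

\[ G = \sum_{r=0}^{n} \iiint_{0 \le u \le x \le v \le 1} \left[ x - \sum_{j=0}^{r-1} \Delta_j - \frac{x-u}{v-u}\,\Delta_r \right]^2 f_r(u,v)\,du\,dv\,dx - \lambda \Big( \sum_{j=0}^{n} \Delta_j - 1 \Big). \]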

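Setting the partial derivative of $G$ with respect to $\Delta_k$ equal to zero and dividing by $-2$ gives

\[ \iiint \Big( x - \sum_{j=0}^{k-1} \Delta_j - \Delta_k \frac{x-u}{v-u} \Big) \frac{x-u}{v-u}\, f_k(u,v)\,du\,dv\,dx + \sum_{r=k+1}^{n} \iiint \Big( x - \sum_{j=0}^{r-1} \Delta_j - \Delta_r \frac{x-u}{v-u} \Big) f_r(u,v)\,du\,dv\,dx + \frac{\lambda}{2} = 0 \]

for $k = 1, \ldots, n-1$, where each integral is taken over $0 \le u \le x \le v \le 1$.

The corresponding condition for $\Delta_0$, using (2.4), is

\[ \iint_{0 \le x \le v \le 1} \Big( x - \Delta_0 \frac{x}{v} \Big) \frac{x}{v}\, n(1-v)^{n-1}\,dv\,dx + \sum_{r=1}^{n} \iiint \Big( x - \sum_{j=0}^{r-1} \Delta_j - \Delta_r \frac{x-u}{v-u} \Big) f_r(u,v)\,du\,dv\,dx + \frac{\lambda}{2} = 0. \]

The corresponding condition for $\Delta_n$, using (2.5), is

\[ \iint_{0 \le u \le x \le 1} \Big( x - \sum_{j=0}^{n-1} \Delta_j - \Delta_n \frac{x-u}{1-u} \Big) \frac{x-u}{1-u}\, n u^{n-1}\,du\,dx + \frac{\lambda}{2} = 0. \]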


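Let

\[ g(u,v) = \frac{x-u}{v-u}, \qquad I_r(1) = \iint_{0 \le u \le x \le v \le 1} f_r(u,v)\,du\,dv, \]

\[ I_r(g) = \iint_{0 \le u \le x \le v \le 1} g(u,v)\, f_r(u,v)\,du\,dv, \qquad I_r(g^2) = \iint_{0 \le u \le x \le v \le 1} g^2(u,v)\, f_r(u,v)\,du\,dv. \]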


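Using this notation, the system of derivatives is

\[ \int_0^1 x I_k(g)\,dx - \sum_{j=0}^{k-1} \Delta_j \int_0^1 I_k(g)\,dx - \Delta_k \int_0^1 I_k(g^2)\,dx + \sum_{r=k+1}^{n} \int_0^1 x I_r(1)\,dx - \sum_{r=k+1}^{n} \sum_{j=0}^{r-1} \Delta_j \int_0^1 I_r(1)\,dx - \sum_{r=k+1}^{n} \Delta_r \int_0^1 I_r(g)\,dx + \frac{\lambda}{2} = 0 \tag{2.10} \]

for $k = 1, \ldots, n-1$.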


All terms in the above equations reduce to Beta functions. The relationship between the Beta and Gamma functions,

\[ B(m,n) = \int_0^1 v^{m-1} (1-v)^{n-1}\,dv = \frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m+n)} = \frac{(m-1)!\,(n-1)!}{(m+n-1)!}, \]

is used heavily. We proceed to evaluate these terms.





Evaluation of $\int_0^1 x I_k(g)\,dx$:

\[ \int_0^1 x I_k(g)\,dx = \frac{n!}{(k-1)!\,(n-k-1)!} \iiint_{0 \le u \le x \le v \le 1} x\, \frac{x-u}{v-u}\, u^{k-1} (1-v)^{n-k-1}\,du\,dv\,dx \]

\[ = \frac{n!}{(k-1)!\,(n-k-1)!} \iint_{0 \le u \le v \le 1} \frac{(v-u)(u+2v)}{6}\, u^{k-1} (1-v)^{n-k-1}\,du\,dv. \]

Integrating over $u$ and then over $v$ reduces each term to a Beta function, and

\[ \int_0^1 x I_k(g)\,dx = \frac{3k+4}{6(n+1)(n+2)}. \tag{2.11} \]
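Evaluation of $\int_0^1 x I_r(1)\,dx$:

\[ \int_0^1 x I_r(1)\,dx = \frac{n!}{(r-1)!\,(n-r-1)!} \iiint_{0 \le u \le x \le v \le 1} x\, u^{r-1} (1-v)^{n-r-1}\,du\,dv\,dx \]

\[ = \frac{n!}{(r-1)!\,(n-r-1)!} \iint_{0 \le u \le v \le 1} \frac{v^2 - u^2}{2}\, u^{r-1} (1-v)^{n-r-1}\,du\,dv = \frac{r+1}{(n+1)(n+2)}. \tag{2.12} \]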
Evaluation of $\int_0^1 I_k(g)\,dx$:

\[ \int_0^1 I_k(g)\,dx = \frac{n!}{(k-1)!\,(n-k-1)!} \iiint_{0 \le u \le x \le v \le 1} \frac{x-u}{v-u}\, u^{k-1} (1-v)^{n-k-1}\,du\,dv\,dx \]

\[ = \frac{n!}{(k-1)!\,(n-k-1)!} \iint_{0 \le u \le v \le 1} \frac{v-u}{2}\, u^{k-1} (1-v)^{n-k-1}\,du\,dv = \frac{1}{2(n+1)}. \tag{2.13} \]


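Evaluation of $\int_0^1 I_k(g^2)\,dx$:

\[ \int_0^1 I_k(g^2)\,dx = \frac{n!}{(k-1)!\,(n-k-1)!} \iiint_{0 \le u \le x \le v \le 1} \Big( \frac{x-u}{v-u} \Big)^2 u^{k-1} (1-v)^{n-k-1}\,du\,dv\,dx \]

\[ = \frac{n!}{(k-1)!\,(n-k-1)!} \iint_{0 \le u \le v \le 1} \frac{v-u}{3}\, u^{k-1} (1-v)^{n-k-1}\,du\,dv = \frac{1}{3(n+1)}. \tag{2.14} \]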

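Evaluation of $\int_0^1 I_r(1)\,dx$:

\[ \int_0^1 I_r(1)\,dx = \frac{n!}{(r-1)!\,(n-r-1)!} \iiint_{0 \le u \le x \le v \le 1} u^{r-1} (1-v)^{n-r-1}\,du\,dv\,dx \]

\[ = \frac{n!}{(r-1)!\,(n-r-1)!} \iint_{0 \le u \le v \le 1} (v-u)\, u^{r-1} (1-v)^{n-r-1}\,du\,dv = \frac{1}{n+1}. \tag{2.15} \]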


The evaluation of $\int_0^1 I_r(g)\,dx$ is the same as the evaluation of $\int_0^1 I_k(g)\,dx$:

\[ \int_0^1 I_r(g)\,dx = \frac{1}{2(n+1)}. \tag{2.16} \]
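The Beta-function evaluations (2.11) to (2.16) can be cross-checked with exact rational arithmetic. The sketch below assumes an interior index $0 < r < n$ and the density (2.3); the helper names `tri_moment`, `expect` and `closed_forms` are hypothetical. It first integrates $x$ over $[u, v]$ by hand (the polynomials in the table `AFTER_X`) and then uses $\iint_{0 \le u \le v \le 1} u^a v^b (1-v)^c\,du\,dv = \frac{(a+b+1)!\,c!}{(a+1)\,(a+b+c+2)!}$.

```python
from fractions import Fraction
from math import factorial


def tri_moment(a, b, c):
    """Exact integral of u^a v^b (1 - v)^c over 0 <= u <= v <= 1."""
    # The inner integral over u gives v^(a+1) / (a+1); the rest is a Beta integral.
    return Fraction(factorial(a + b + 1) * factorial(c),
                    (a + 1) * factorial(a + b + c + 2))


def expect(poly, n, r):
    """Integrate poly(u, v) * f_r(u, v) over 0 <= u <= v <= 1 for 0 < r < n."""
    const = Fraction(factorial(n), factorial(r - 1) * factorial(n - r - 1))
    return const * sum(coef * tri_moment(r - 1 + i, j, n - r - 1)
                       for (i, j), coef in poly.items())


# What is left after integrating x over [u, v]; keys are (power of u, power of v).
AFTER_X = {
    "x I(g)": {(0, 2): Fraction(1, 3), (1, 1): Fraction(-1, 6), (2, 0): Fraction(-1, 6)},
    "x I(1)": {(0, 2): Fraction(1, 2), (2, 0): Fraction(-1, 2)},
    "I(g)": {(0, 1): Fraction(1, 2), (1, 0): Fraction(-1, 2)},
    "I(g^2)": {(0, 1): Fraction(1, 3), (1, 0): Fraction(-1, 3)},
    "I(1)": {(0, 1): Fraction(1), (1, 0): Fraction(-1)},
}


def closed_forms(n, r):
    """Right-hand sides of (2.11) to (2.16)."""
    return {
        "x I(g)": Fraction(3 * r + 4, 6 * (n + 1) * (n + 2)),
        "x I(1)": Fraction(r + 1, (n + 1) * (n + 2)),
        "I(g)": Fraction(1, 2 * (n + 1)),
        "I(g^2)": Fraction(1, 3 * (n + 1)),
        "I(1)": Fraction(1, n + 1),
    }


def all_match(n_max=8):
    """True if every closed form agrees with the exact integral for 2 <= n <= n_max."""
    return all(expect(AFTER_X[name], n, r) == value
               for n in range(2, n_max + 1)
               for r in range(1, n)
               for name, value in closed_forms(n, r).items())


if __name__ == "__main__":
    print(all_match())
```

Because everything is kept as `Fraction`, agreement is exact rather than up to rounding.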


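Using the evaluations (2.11), (2.12), (2.13), (2.14), (2.15) and (2.16) in (2.10) and multiplying by $(n+1)$, the derivative with respect to $\Delta_k$ becomes

\[ \frac{3k+4}{6(n+2)} - \frac{1}{2} \sum_{j=0}^{k-1} \Delta_j - \frac{1}{3} \Delta_k + \frac{1}{n+2} \sum_{r=k+1}^{n} (r+1) - \sum_{r=k+1}^{n} \sum_{j=0}^{r-1} \Delta_j - \frac{1}{2} \sum_{r=k+1}^{n} \Delta_r + \frac{\lambda (n+1)}{2} = 0. \tag{2.17} \]

Multiplying by 6 and using (2.6) and (2.7), this is

\[ \frac{3k+4}{n+2} - 3C_k - 2(C_{k+1} - C_k) + 3(n+1) - \frac{3(k+2)(k+1)}{n+2} - 6 \sum_{r=k+1}^{n} C_r - 3(1 - C_{k+1}) + 3\lambda(n+1) = 0, \]

or, after simplification, the derivative (2.10) is

\[ C_k - C_{k+1} + 6 \sum_{r=k+1}^{n} C_r = 3\lambda(n+1) + 3n - \frac{3k^2 + 6k + 2}{n+2} \quad \text{for } k = 1, \ldots, n-1. \tag{2.18} \]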


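The derivative with respect to $\Delta_0$ is

\[ \iint_{0 \le x \le v \le 1} \Big( x - \Delta_0 \frac{x}{v} \Big) \frac{x}{v}\, n(1-v)^{n-1}\,dv\,dx + \sum_{r=1}^{n} \iiint \Big( x - \sum_{j=0}^{r-1} \Delta_j - \Delta_r \frac{x-u}{v-u} \Big) f_r(u,v)\,du\,dv\,dx + \frac{\lambda}{2} = 0, \tag{2.19} \]

that is,

\[ n \iint \frac{x^2}{v} (1-v)^{n-1}\,dv\,dx - \Delta_0\, n \iint \frac{x^2}{v^2} (1-v)^{n-1}\,dv\,dx + \sum_{r=1}^{n} \int_0^1 x I_r(1)\,dx - \sum_{r=1}^{n} C_r \int_0^1 I_r(1)\,dx - \sum_{r=1}^{n} \Delta_r \int_0^1 I_r(g)\,dx + \frac{\lambda}{2} = 0. \]

This can be reduced to

\[ -C_1 + 6 \sum_{k=1}^{n} C_k = 3\lambda(n+1) + 3n - \frac{2}{n+2} \quad \text{for } k = 0. \tag{2.20} \]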

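The derivative with respect to $\Delta_n$ is

\[ \iint_{0 \le u \le x \le 1} \Big( x - \sum_{j=0}^{n-1} \Delta_j - \Delta_n \frac{x-u}{1-u} \Big) \frac{x-u}{1-u}\, n u^{n-1}\,du\,dx + \frac{\lambda}{2} = 0, \tag{2.21} \]

which can be reduced to

\[ \frac{2}{n+2} - \Delta_n - 3\lambda(n+1) = 0 \]

or

\[ C_n - 1 = 3\lambda(n+1) - \frac{2}{n+2} \quad \text{for } k = n. \tag{2.22} \]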
So in general

\[ C_k - C_{k+1} + 6 \sum_{r=k+1}^{n} C_r = 3\lambda(n+1) + 3n - \frac{3k^2 + 6k + 2}{n+2} \quad \text{for } k = 0, 1, \ldots, n. \tag{2.23} \]


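Using (2.23), the system of equations in matrix form is

\[
\begin{pmatrix}
5 & 6 & 6 & \cdots & 6 & 6 & 0 \\
1 & 5 & 6 & \cdots & 6 & 6 & 0 \\
0 & 1 & 5 & \cdots & 6 & 6 & 0 \\
\vdots & & \ddots & \ddots & & & \vdots \\
0 & \cdots & 0 & 1 & 5 & 6 & 0 \\
0 & \cdots & 0 & 0 & 1 & 5 & 0 \\
0 & \cdots & 0 & 0 & 0 & 1 & -1
\end{pmatrix}
\begin{pmatrix} C_1 \\ C_2 \\ C_3 \\ \vdots \\ C_{n-1} \\ C_n \\ C_{n+1} \end{pmatrix}
=
\begin{pmatrix}
\theta - 2/(n+2) \\
\theta - 11/(n+2) \\
\theta - 26/(n+2) \\
\vdots \\
\theta - (3n^2 - 6n + 2)/(n+2) \\
\theta - (3n^2 - 1)/(n+2) \\
\theta - (3n^2 + 6n + 2)/(n+2)
\end{pmatrix}
\]

where $\theta = 3\lambda(n+1) + 3n$.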



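After successive row subtractions the matrix equation is

\[
\begin{pmatrix}
4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & \cdots & 0 & 0 & 0 \\
\vdots & & \ddots & \ddots & \ddots & & \vdots \\
0 & \cdots & 1 & 4 & 1 & 0 & 0 \\
0 & \cdots & 0 & 0 & 1 & 4 & 1 \\
0 & \cdots & 0 & 0 & 0 & 1 & -1
\end{pmatrix}
\begin{pmatrix} C_1 \\ C_2 \\ C_3 \\ \vdots \\ C_{n-1} \\ C_n \\ C_{n+1} \end{pmatrix}
=
\begin{pmatrix}
9/(n+2) \\
15/(n+2) \\
21/(n+2) \\
\vdots \\
(6n-3)/(n+2) \\
(6n+3)/(n+2) \\
\theta - (3n^2 + 6n + 2)/(n+2)
\end{pmatrix}
\]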





To solve this matrix equation, consider the linear second-order difference equation

\[ C_k + 4C_{k+1} + C_{k+2} = \frac{6k+9}{n+2}, \tag{2.24} \]

with boundary conditions $C_0 = 0$ and $C_{n+1} = 1$.

The corresponding homogeneous equation of (2.24),

\[ C_k + 4C_{k+1} + C_{k+2} = 0, \tag{2.25} \]

has the auxiliary equation

\[ m^2 + 4m + 1 = 0. \]

The roots of this equation can be calculated using the quadratic formula. These roots are $A = -2 + \sqrt{3}$ and $B = -2 - \sqrt{3}$.

So the general solution of (2.25) is given by

\[ C_k^{c} = K_1 A^k + K_2 B^k. \tag{2.26} \]





A trial particular solution of (2.24) is

\[ C_k^{p} = D_0 + D_1 k. \]

Using (2.24), $D_0 = \dfrac{1}{2(n+2)}$ and $D_1 = \dfrac{1}{n+2}$; then

\[ C_k^{p} = \frac{1 + 2k}{2(n+2)}. \]

So the general solution is

\[ C_k = K_1 A^k + K_2 B^k + \frac{1 + 2k}{2(n+2)}. \tag{2.27} \]

The boundary conditions determine the values of $K_1$ and $K_2$. $C_0 = 0$ in (2.27) implies

\[ K_1 + K_2 = -\frac{1}{2(n+2)}, \tag{2.28} \]

and $C_{n+1} = 1$ in (2.27) implies

\[ K_1 A^{n+1} + K_2 B^{n+1} = \frac{1}{2(n+2)}. \tag{2.29} \]


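The simultaneous solution of (2.28) and (2.29) is

\[ K_1 = \frac{1 + B^{n+1}}{2(n+2)(A^{n+1} - B^{n+1})}, \qquad K_2 = -\frac{1 + A^{n+1}}{2(n+2)(A^{n+1} - B^{n+1})}, \]

so, using $AB = 1$,

\[ C_k = \frac{1}{n+2} \left[ \frac{2k+1}{2} + \frac{(A^k - B^k) - (A^{n+1-k} - B^{n+1-k})}{2(A^{n+1} - B^{n+1})} \right] \quad \text{for } k = 0, \ldots, n+1; \tag{2.30} \]

then, from (2.6),

\[ \Delta_k = \frac{1}{n+2} \left[ 1 - \frac{(1 - A)(A^{n-k} + A^k) - (1 - B)(B^{n-k} + B^k)}{2(A^{n+1} - B^{n+1})} \right] \quad \text{for } k = 0, \ldots, n. \tag{2.31} \]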
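The difference equation $C_k + 4C_{k+1} + C_{k+2} = (6k+9)/(n+2)$ with $C_0 = 0$, $C_{n+1} = 1$ can also be solved directly as a tridiagonal system, which gives an independent check on the closed form in terms of $A = -2 + \sqrt{3}$ and $B = -2 - \sqrt{3}$. The sketch below is an illustration under that reading of the recurrence. The function names are hypothetical, and the closed-form expression coded in `closed_form` is the one obtained by solving for $K_1$ and $K_2$ and using $AB = 1$.

```python
import math
from fractions import Fraction


def solve_recurrence(n):
    """Exact C_0, ..., C_{n+1} from C_k + 4 C_{k+1} + C_{k+2} = (6k + 9)/(n + 2)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Unknowns are C_1, ..., C_n; equation k (k = 0..n-1) has diagonal 4, off-diagonals 1.
    rhs = [Fraction(6 * k + 9, n + 2) for k in range(n)]
    rhs[-1] -= 1  # move the known C_{n+1} = 1 to the right side (C_0 = 0 adds nothing)
    diag = [Fraction(4)] * n
    for i in range(1, n):  # forward elimination (Thomas algorithm)
        m = 1 / diag[i - 1]
        diag[i] = diag[i] - m
        rhs[i] = rhs[i] - m * rhs[i - 1]
    inner = [Fraction(0)] * n
    inner[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        inner[i] = (rhs[i] - inner[i + 1]) / diag[i]
    return [Fraction(0)] + inner + [Fraction(1)]


def closed_form(n):
    """Floating-point C_k from the roots A, B of m^2 + 4m + 1 = 0."""
    a = -2 + math.sqrt(3)
    b = -2 - math.sqrt(3)
    big_n = n + 1
    d = a ** big_n - b ** big_n
    return [((2 * k + 1) / 2
             + ((a ** k - b ** k) - (a ** (big_n - k) - b ** (big_n - k))) / (2 * d))
            / (n + 2)
            for k in range(big_n + 1)]


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        exact = solve_recurrence(n)
        approx = closed_form(n)
        worst = max(abs(float(e) - c) for e, c in zip(exact, approx))
        print(n, [str(e) for e in exact], worst)
```

For $n = 2$ the exact solution is $C = (0, 5/12, 7/12, 1)$, and the coefficients satisfy $C_k + C_{n+1-k} = 1$, as the symmetry of the problem suggests.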
III. CALCULATION OF RISK


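From equation (2.8) the risk equation is

\[ R = \sum_{r=0}^{n} \iiint_{0 \le u \le x \le v \le 1} \left[ (x - C_r) - g\,\Delta_r \right]^2 f_r(u,v)\,du\,dv\,dx \tag{3.1} \]

\[ = \sum_{r=0}^{n} \left\{ \int_0^1 x^2 I_r(1)\,dx - 2C_r \int_0^1 x I_r(1)\,dx + C_r^2 \int_0^1 I_r(1)\,dx - 2\Delta_r \int_0^1 x I_r(g)\,dx + 2C_r \Delta_r \int_0^1 I_r(g)\,dx + \Delta_r^2 \int_0^1 I_r(g^2)\,dx \right\}. \]

Using, for $r = 0, \ldots, n$,

\[ \int_0^1 x^2 I_r(1)\,dx = \frac{(r+2)(r+1)}{(n+3)(n+2)(n+1)}, \qquad \int_0^1 x I_r(1)\,dx = \frac{r+1}{(n+1)(n+2)}, \qquad \int_0^1 I_r(1)\,dx = \frac{1}{n+1}, \]

\[ \int_0^1 x I_r(g)\,dx = \frac{3r+4}{6(n+1)(n+2)}, \qquad \int_0^1 I_r(g)\,dx = \frac{1}{2(n+1)}, \qquad \int_0^1 I_r(g^2)\,dx = \frac{1}{3(n+1)}, \]

and $C_r^2 + C_r \Delta_r = C_r C_{r+1}$, then

\[ (n+1)R = \frac{n+1}{3} - \frac{2}{n+2} \sum_{r=0}^{n} (r+1) C_r - \frac{1}{3(n+2)} \sum_{r=0}^{n} (3r+4) \Delta_r + \sum_{r=0}^{n} C_r C_{r+1} + \frac{1}{3} \sum_{r=0}^{n} \Delta_r^2. \]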
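Substituting the six integrals into the expansion (3.1) makes $(n+1)R$ a quadratic expression in the coefficients, which is easy to evaluate exactly for any candidate sequence. The sketch below codes one such rearrangement, $(n+1)R = \frac{n+1}{3} - \frac{2}{n+2}\sum (r+1)C_r - \frac{1}{3(n+2)}\sum (3r+4)\Delta_r + \sum C_r C_{r+1} + \frac{1}{3}\sum \Delta_r^2$, with all sums over $r = 0, \ldots, n$. The function name `scaled_risk` is hypothetical, and the Pyke sequence $C_r = r/(n+1)$ is used only as an assumed test case.

```python
from fractions import Fraction


def scaled_risk(coeffs):
    """(n + 1) * R for the estimator through (X_r, C_r), uniform data on [0, 1]."""
    n = len(coeffs) - 2
    c = [Fraction(x) for x in coeffs]
    if n < 1 or c[0] != 0 or c[-1] != 1:
        raise ValueError("need C_0 = 0, C_{n+1} = 1 and at least one observation")
    d = [c[r + 1] - c[r] for r in range(n + 1)]  # Delta_r = C_{r+1} - C_r
    linear_c = sum((r + 1) * c[r] for r in range(n + 1))
    linear_d = sum((3 * r + 4) * d[r] for r in range(n + 1))
    cross = sum(c[r] * c[r + 1] for r in range(n + 1))
    squares = sum(x * x for x in d)
    return (Fraction(n + 1, 3)
            - Fraction(2, n + 2) * linear_c
            - Fraction(1, 3 * (n + 2)) * linear_d
            + cross
            + squares / 3)


if __name__ == "__main__":
    n = 2
    pyke = [Fraction(r, n + 1) for r in range(n + 2)]  # assumed Pyke sequence
    optimum = [Fraction(0), Fraction(5, 12), Fraction(7, 12), Fraction(1)]
    print("Pyke   :", scaled_risk(pyke))     # 1/12
    print("optimum:", scaled_risk(optimum))  # 11/144
```

For $n = 2$ the sequence $(0, 5/12, 7/12, 1)$ that solves the difference equation gives $11/144$, slightly below the value $1/12$ for the Pyke sequence.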


Calculation of $\sum_{r=0}^{n} \Delta_r^2$

Using (2.31),

\[ \sum_{r=0}^{n} \Delta_r^2 = \frac{1}{(n+2)^2} \sum_{r=0}^{n} \left[ 1 - \frac{(1-A)(A^{n-r} + A^r) - (1-B)(B^{n-r} + B^r)}{2(A^{n+1} - B^{n+1})} \right]^2, \]

and using

\[ \sum_{r=0}^{n} (A^{n-r} + A^r) = 2 \sum_{r=0}^{n} A^r = \frac{2(1 - A^{n+1})}{1 - A}, \qquad \sum_{r=0}^{n} (A^{2(n-r)} + A^{2r}) = 2 \sum_{r=0}^{n} A^{2r} = \frac{2(1 - A^{2(n+1)})}{1 - A^2}, \]

and similarly for $B$, together with

\[ AB = 1, \qquad (1-A)(1-B) = 6, \qquad \frac{1-A}{1+A} = \sqrt{3}, \qquad \frac{1-B}{1+B} = -\sqrt{3}, \]

then

\[ \sum_{r=0}^{n} \Delta_r^2 = \frac{n+3}{(n+2)^2} + \frac{2(n+1)\left[ (1-A)^2 A^n + (1-B)^2 B^n \right] - 2\sqrt{3}\,(A^{2(n+1)} - B^{2(n+1)}) - 4\sqrt{3}\,(A^{n+1} - B^{n+1}) - 24(n+1)}{4(n+2)^2 (A^{n+1} - B^{n+1})^2}. \tag{3.2} \]


Calculation of $\sum_{r=0}^{n} C_r C_{r+1}$

Using (2.30), each product $(n+2)^2 C_r C_{r+1}$ is the sum of the polynomial part $(r + \tfrac{1}{2})(r + \tfrac{3}{2}) = r^2 + 2r + \tfrac{3}{4}$, two cross terms in which $(r + \tfrac{1}{2})$ and $(r + \tfrac{3}{2})$ multiply the exponential parts of $C_{r+1}$ and $C_r$, and the product of the two exponential parts. Here

\[ \sum_{r=0}^{n} \left( r^2 + 2r + \tfrac{3}{4} \right) = \frac{n(n+1)(4n+14) + 9(n+1)}{12}, \]

and the remaining sums are geometric series in $A$ and $B$. Then

\[ \sum_{r=0}^{n} C_r C_{r+1} = \frac{n(n+1)(4n+14) + 9(n+1)}{12(n+2)^2} - \frac{2}{3(n+2)^2} - \frac{\sqrt{3}\,(2n+1)(A^{n+1} + B^{n+1} + 2)}{12(n+2)^2 (A^{n+1} - B^{n+1})} + \frac{(n+1)(A^{n+1} + B^{n+1} + 2)}{(n+2)^2 (A^{n+1} - B^{n+1})^2}. \tag{3.3} \]
Using (3.2) and (3.3), then

\[ (n+1)R = \frac{2n^2 + 3n - 1}{12(n+2)^2} - \frac{\sqrt{3}\,(A^{n+1} + B^{n+1})}{12(n+2)^2 (A^{n+1} - B^{n+1})} - \frac{\sqrt{3}}{6(n+2)^2 (A^{n+1} - B^{n+1})}. \tag{3.4} \]



IV. COMPARISONS AND SUMMARY

From (3.17), page 14 of Prof. Read's paper,
| 
la 1 | 
(n+1)°R(C) = i (n—1)x(1—x)dx + 2(n+1) f \ (1-x)b(x) + n(1-x)} ax 
0 0 


where 





> 
t! 
| + 
bad 
a 
PS 
t 
a 
ee 
5, 
c 


and h(z) = / 
n+] 0 


\[ (n+1)^2 R(C) = \frac{n-1}{6} + 4(n+1) \int_0^1 (1-x)\,h(x)\,dx \]


and evaluation of


5a 
X= 


1 1 
f (exn(zjax = f (-z) f ' ua 
0 0 0 Vieu 





q 


= ie (2) =) agus 





OLu<x~I 1=-u 
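Let

\[ y = \frac{x-u}{1-u}, \qquad 0 \le y \le 1; \]

then

\[ x = u + (1-u)y, \qquad 1 - x = (1-u)(1-y) \qquad \text{and} \qquad dy = \frac{dx}{1-u}. \]

So

\[ (n+1)R(C) = \frac{n-1}{6(n+1)} + \frac{1}{3(n+2)(n+1)}, \tag{4.1} \]

where $R(C)$ is the Pyke risk.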





The risk of the sample cumulative distribution function, $R(\mathrm{SDF})$, is

\[ R(\mathrm{SDF}) = \frac{1}{n} \int_0^1 x(1-x)\,dx = \frac{1}{6n}; \]

then

\[ (n+1)R(\mathrm{SDF}) = \frac{n+1}{6n}. \tag{4.2} \]


The optimum risk $R(\mathrm{OPT})$ derived in this paper is

\[ (n+1)R(\mathrm{OPT}) = \frac{2n^2 + 3n - 1}{12(n+2)^2} - \frac{\sqrt{3}\,(A^{n+1} + B^{n+1})}{12(n+2)^2 (A^{n+1} - B^{n+1})} - \frac{\sqrt{3}}{6(n+2)^2 (A^{n+1} - B^{n+1})}. \tag{4.3} \]

In this equation the last two terms may be omitted, since they are close to zero for $n \ge 6$:

\[ (n+1)R(\mathrm{OPT}) \approx \frac{2n^2 + 3n - 1}{12(n+2)^2}. \]

Computer output for $n = 1, \ldots, 60$ follows. Equations (4.1), (4.2) and (4.3) are used.
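The three scaled risks can be tabulated side by side with a short script. The sketch below is only an illustration of such a comparison, not the program that produced the original computer output. It uses (4.1), (4.2) and the leading term $\frac{2n^2+3n-1}{12(n+2)^2}$ of the optimum risk, so the optimum column is an approximation that drops the terms said to be negligible for larger $n$. All function names are hypothetical.

```python
from fractions import Fraction


def scaled_risk_sdf(n):
    """(n + 1) R(SDF) from (4.2)."""
    return Fraction(n + 1, 6 * n)


def scaled_risk_pyke(n):
    """(n + 1) R(C) from (4.1)."""
    return Fraction(n - 1, 6 * (n + 1)) + Fraction(1, 3 * (n + 2) * (n + 1))


def scaled_risk_opt_leading(n):
    """Leading term of (n + 1) R(OPT); the remaining terms are omitted."""
    return Fraction(2 * n * n + 3 * n - 1, 12 * (n + 2) ** 2)


def comparison_rows(ns):
    """Rows (n, SDF, Pyke, OPT leading term) as floats for printing."""
    return [(n,
             float(scaled_risk_sdf(n)),
             float(scaled_risk_pyke(n)),
             float(scaled_risk_opt_leading(n)))
            for n in ns]


if __name__ == "__main__":
    print(f"{'n':>3} {'(n+1)R(SDF)':>12} {'(n+1)R(C)':>12} {'(n+1)R(OPT)~':>13}")
    for n, sdf, pyke, opt in comparison_rows((6, 10, 20, 40, 60)):
        print(f"{n:>3} {sdf:>12.5f} {pyke:>12.5f} {opt:>13.5f}")
```

All three columns tend to $1/6$ as $n$ grows, with the approximate optimum below the Pyke value and the Pyke value below the sample distribution function value.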
[Table of computer output for N = 1, ..., 60, with columns N, (N+1)R(SDF), (N+1)R(OPT), LAST TWO TERMS OF (N+1)R(OPT), and (N+1)R(C). The numerical entries are not legible in this copy.]
After computer calculations for $n = 1, \ldots, 60$, (4.1), (4.2) and (4.3) are shown in (4.4) and (4.5).

[Figure (4.4): $(n+1)R$ plotted against $n$, with curves labeled $(n+1)R(C)$ and $(n+1)R(\mathrm{OPT})$.]

[Figure (4.5): $(n+1)R$ plotted against $n$, for $n$ up to 60.]